SAP provides the CCC converter program to convert characters from one encoding to another.
Character encoding (aka code page)
A character encoding consists of a name ("utf-8", "iso-8859-1", etc.) and an equivalence table that maps each character of a character set to its octet (byte) values.
Code page is the term that SAP uses instead of character encoding. A code page is identified by a 4-digit number instead of a name.
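The "equivalence table" idea can be made concrete with a minimal sketch. It is written in Python rather than ABAP and uses Python's built-in codecs as a stand-in for SAP code pages (an assumption for illustration only; the CCC converter is not involved). The helper name `octets` is hypothetical.

```python
# Illustration only: Python's built-in codecs stand in for SAP code pages.

def octets(s: str, encoding: str) -> str:
    """Return the bytes of s in the given encoding as upper-case hex."""
    return s.encode(encoding).hex(" ").upper()

text = "é€"

# The same two characters map to different octet values in each encoding.
for name in ("utf-8", "windows-1252", "utf-16be", "utf-16le"):
    print(f"{name:>12}: {octets(text, name)}")
```

The characters stay the same in every row; only the octets change, which is why the receiver has to know which encoding the sender used.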
Equivalences between Character encoding international name and SAP code page number
Some SAP programs expect:
- either a 4-character code: you then have to enter the SAP code page number.
  - You can find the SAP code page number from the international character encoding name by calling the SCP_CODEPAGE_BY_EXTERNAL_NAME function module, or by looking it up in the TCP00A database table.
- or a 20-character code: usually, you may enter either the character encoding name or the SAP code page number. The case of the character encoding name is usually ignored.
Examples of a few equivalences:
SAP code page | Character encoding international name |
---|---|
124 | IBM EBCDIC 00697/00297 |
1100 | iso-8859-1 |
1105 | US-ASCII (7 bits) |
1160 | windows-1252 |
4102 | utf-16be |
4103 | utf-16le |
4110 | utf-8 |
8000 | Shift-JIS |
8300 | BIG5 |
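The name-to-number lookup can be sketched as follows. This is a hypothetical Python dictionary built only from the rows of the table above; it is not how SAP stores the mapping, and the names `SAP_CODEPAGE_BY_NAME` and `sap_codepage` are assumptions made for the example.

```python
# Hypothetical lookup built only from the table rows above; in an SAP system
# the real mapping lives in the system itself, not in a dictionary like this.
SAP_CODEPAGE_BY_NAME = {
    "iso-8859-1": "1100",
    "us-ascii": "1105",
    "windows-1252": "1160",
    "utf-16be": "4102",
    "utf-16le": "4103",
    "utf-8": "4110",
    "shift-jis": "8000",
    "big5": "8300",
}

def sap_codepage(external_name: str) -> str:
    """Return the SAP code page number for an encoding name, ignoring case."""
    return SAP_CODEPAGE_BY_NAME[external_name.strip().lower()]

print(sap_codepage("UTF-8"))
```

Lower-casing the input before the lookup mirrors the remark that the case of the character encoding name is usually ignored.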
Usual problems with Character encoding conversion
- Converting from one code page to another may not be possible for all characters of the source code page, because some of them do not exist in the target code page.
  - For example, converting from big5 (Chinese) to us-ascii makes no sense. If you think that it should be possible, then you probably didn't choose the right code page.
  - When a character cannot be converted, we have to provide a replacement character to the CCC converter.
- A sequence of bytes is not recognized as a character in the source code page. This means that:
  - either the sender program does not respect the code page (then ask the owner of the sender program to correct the error),
  - or you should choose another code page (sometimes the differences between code pages are very small),
  - or your program has erroneously truncated the input bytes, so that the last input byte(s) no longer form a complete character.
    - For example, the two bytes D8 00 alone mean nothing in utf-16be (00 D8 in utf-16le): they are only the first half of a surrogate pair, and two more bytes are expected to identify the character (which is encoded on 4 bytes).
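Both problems can be reproduced outside SAP. The following is a minimal sketch in Python, using Python's codecs as a stand-in for SAP code pages (an assumption for illustration; the CCC converter may report these situations differently). The sample string and variable names are made up for the example.

```python
# Illustration only: Python codecs stand in for SAP code pages.

# Problem 1: a character does not exist in the target code page.
chinese = "中文"
try:
    chinese.encode("us-ascii")
    unmappable = False
except UnicodeEncodeError:
    unmappable = True  # a strict conversion is refused

# Supplying a replacement character lets the conversion finish.
replaced = chinese.encode("us-ascii", errors="replace")

# Problem 2: the input bytes were cut in the middle of a character.
# U+10000 takes 4 bytes in UTF-16; in big-endian order they are D8 00 DC 00.
full = "\U00010000".encode("utf-16be")
truncated = full[:2]  # only the first half of the surrogate pair remains
try:
    truncated.decode("utf-16be")
    incomplete = False
except UnicodeDecodeError:
    incomplete = True  # the decoder expects two more bytes

print(unmappable, replaced, full.hex(" "), incomplete)
```

In the first case the information is lost for good once the replacement character is written, so it is worth checking the choice of target code page before accepting replacements.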
How to call the CCC converter
The CCC converter is a kernel program that can be accessed in several ways:
- CL_ABAP_CODEPAGE class, available since 7.02. The code page cannot be given as the SAP number; it must be either the "character encoding international name" or the name as used in the Java language.
- CL_ABAP_CONV_* classes, since 6.10, where CL_ABAP_CONV_OBJ is the master class which gives full access to the CCC converter. The following classes call the CCC converter with default values:
  - CL_ABAP_CONV_IN_CE: converts bytes representing characters in a given code page into a character or string variable
  - CL_ABAP_CONV_OUT_CE: converts a character or string variable into bytes representing characters in a given code page
  - CL_ABAP_CONV_X2X_CE: converts bytes representing characters in a given code page into bytes representing characters in another given code page
- SCP_TRANSLATE_CHARS function module, which works with all releases
Note: CCC stands for Character set Conversion Cache, a memory area where SAP stores the code pages it needs for conversions.
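The three conversion directions (bytes to text, text to bytes, bytes to bytes) can be pictured with a small Python analogy. These functions are hypothetical and are not the SAP classes or their signatures; they only mirror the direction of each conversion, using Python's codecs in place of the CCC converter.

```python
# Illustration only: plain Python functions mirroring the three conversion
# directions. They are NOT the SAP classes, and all names are hypothetical.

def convert_in(data: bytes, codepage: str) -> str:
    """Bytes in a given code page -> text."""
    return data.decode(codepage)

def convert_out(text: str, codepage: str) -> bytes:
    """Text -> bytes in a given code page ('?' replaces unmappable chars)."""
    return text.encode(codepage, errors="replace")

def convert_x2x(data: bytes, source: str, target: str) -> bytes:
    """Bytes in one code page -> bytes in another, going through text."""
    return convert_out(convert_in(data, source), target)

latin1_bytes = convert_out("Grüße", "iso-8859-1")
utf8_bytes = convert_x2x(latin1_bytes, "iso-8859-1", "utf-8")
print(latin1_bytes.hex(" "), "->", utf8_bytes.hex(" "))
```

The bytes-to-bytes direction is just the other two chained together, which is why a wrong source code page corrupts the result even when the target code page is correct.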
Links
- SDN blog - BSP - a Developer's Journal: Part VII - Dealing with multiple languages (English, German, Spanish, Thai, and Polish), by Thomas Jung
- What is Unicode
- Unicode Transformation Format
- SAP library:
  - Internationalization
  - Character codes: short explanation of character encoding
  - Data conversion: short explanation of conversion possibilities in ABAP
3 Comments
Former Member
Very helpful! Thanks again Sandra!
Paolo Baruffaldi
Extremely helpful!
Thanks Sandra Rossi
Marco SILVA
Hi Sandra,
Thank you for this blog!
I'm trying to create a file in encoding "ISO-8859-15", which should work for example for € (EUR) character, but at the end I get a file in "Windows-1252" encoding and displaying wrong those special characters...
Any clue about what can be wrong?
Best regards,
Marco Silva